Local Optimisation of Nyström Samples Through Stochastic Gradient Descent
Authors
Abstract
We study a relaxed version of the column-sampling problem for the Nyström approximation of kernel matrices, where approximations are defined from multisets of landmark points in the ambient space; such multisets are referred to as Nyström samples. We consider an unweighted variation of the radial squared-kernel discrepancy (SKD) criterion as a surrogate for the classical criteria used to assess the approximation accuracy; in this setting, we discuss how Nyström samples can be efficiently optimised through stochastic gradient descent. We perform numerical experiments which demonstrate that the local minimisation of the SKD yields Nyström samples with improved approximation accuracy in terms of the trace, Frobenius and spectral norms.
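As a rough illustration of the approach described in the abstract, the sketch below runs stochastic gradient descent on a multiset of landmark points under an SKD-style surrogate. The specific discrepancy (an MMD-type quantity built from the squared Gaussian kernel, with the constant data-data term dropped), the kernel, the step-size schedule and the function names (sq_kernel, skd, optimise_landmarks) are illustrative assumptions and are not taken from the paper itself.

```python
# Sketch only: SGD on Nystrom landmarks under an assumed SKD-style surrogate.
import jax
import jax.numpy as jnp

def sq_kernel(a, b, lengthscale=1.0):
    # Pairwise squared Gaussian kernel k(a, b)^2 between the rows of a and b.
    d2 = jnp.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-d2 / lengthscale**2)  # equals exp(-d2/(2*ls^2)) squared

def skd(landmarks, batch):
    # Assumed unweighted SKD-style surrogate: squared-kernel MMD between the
    # landmarks and a mini-batch of data, dropping the constant data-data term.
    ss = jnp.mean(sq_kernel(landmarks, landmarks))
    sx = jnp.mean(sq_kernel(landmarks, batch))
    return ss - 2.0 * sx

grad_skd = jax.jit(jax.grad(skd))  # gradient with respect to the landmarks

def optimise_landmarks(X, S0, lr=0.05, batch_size=128, n_steps=2000, seed=0):
    # Plain SGD: at each step, estimate the gradient of the surrogate from a
    # random mini-batch of data points and move the landmarks accordingly.
    X, S = jnp.asarray(X), jnp.asarray(S0)
    key = jax.random.PRNGKey(seed)
    for _ in range(n_steps):
        key, sub = jax.random.split(key)
        idx = jax.random.choice(sub, X.shape[0], shape=(batch_size,), replace=False)
        S = S - lr * grad_skd(S, X[idx])
    return S
```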
Similar resources
Local Gain Adaptation in Stochastic Gradient Descent
Gain adaptation algorithms for neural networks typically adjust learning rates by monitoring the correlation between successive gradients. Here we discuss the limitations of this approach, and develop an alternative by extending Sutton’s work on linear systems to the general, nonlinear case. The resulting online algorithms are computationally little more expensive than other acceleration techni...
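A minimal sketch, under stated assumptions, of the kind of gain-adaptation rule alluded to above: each parameter keeps its own learning rate, which grows when successive gradients agree and shrinks when they disagree. The exponentiated update and the meta-learning rate mu are illustrative choices, not the algorithm developed in the cited paper.

```python
# Sketch only: per-parameter gain adaptation driven by successive gradients.
import numpy as np

def sgd_with_gain_adaptation(grad_fn, w0, n_steps=1000, eta0=0.01, mu=0.1):
    w = np.array(w0, dtype=float)
    log_eta = np.full_like(w, np.log(eta0))  # per-parameter log learning rates
    g_prev = np.zeros_like(w)
    for _ in range(n_steps):
        g = grad_fn(w)
        log_eta += mu * g * g_prev           # gains grow when gradients correlate
        w -= np.exp(log_eta) * g             # SGD step with per-parameter gains
        g_prev = g
    return w
```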
Preconditioned Stochastic Gradient Descent Optimisation for Monomodal Image Registration
We present a stochastic optimisation method for intensity-based monomodal image registration. The method is based on a Robbins-Monro stochastic gradient descent method with adaptive step size estimation, and adds a preconditioning matrix. The derivation of the pre-conditioner is based on the observation that, after registration, the deformed moving image should approximately equal the fixed ima...
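A minimal sketch, assuming a caller-supplied preconditioning matrix P, of a preconditioned Robbins-Monro SGD update in the spirit of the method summarised above; the decaying gain sequence is a common choice in this literature, and the image-based derivation of the preconditioner is not reproduced here.

```python
# Sketch only: Robbins-Monro SGD with a fixed preconditioning matrix P.
import numpy as np

def preconditioned_sgd(grad_fn, w0, P, n_steps=500, a=1.0, A=10.0, alpha=0.602):
    w = np.array(w0, dtype=float)
    for k in range(n_steps):
        gamma_k = a / (k + 1 + A) ** alpha  # decaying Robbins-Monro step size
        w -= gamma_k * (P @ grad_fn(w))     # preconditioned stochastic gradient step
    return w
```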
Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression
In this paper, we consider supervised learning problems such as logistic regression and study the stochastic gradient method with averaging, in the usual stochastic approximation setting where observations are used only once. We show that after N iterations, with a constant step size proportional to 1/(R√N), where N is the number of observations and R is the maximum norm of the observations, the...
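A minimal sketch of averaged SGD for logistic regression with a constant step size proportional to 1/(R√N) and each observation used once, as summarised above; the proportionality constant c and the plain running (Polyak-Ruppert) average are illustrative assumptions.

```python
# Sketch only: single-pass averaged SGD for logistic regression.
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def averaged_sgd_logreg(X, y, c=1.0):
    # X: (N, d) observations, y: labels in {-1, +1}.
    N, d = X.shape
    R = np.max(np.linalg.norm(X, axis=1))  # maximum norm of the observations
    step = c / (R * np.sqrt(N))            # constant step size ~ 1/(R sqrt(N))
    w = np.zeros(d)
    w_bar = np.zeros(d)
    for t in range(N):
        x_t, y_t = X[t], y[t]
        # Gradient of the logistic loss log(1 + exp(-y <w, x>)) at the current iterate.
        g = -y_t * x_t * expit(-y_t * np.dot(w, x_t))
        w = w - step * g
        w_bar += (w - w_bar) / (t + 1)     # running average of the iterates
    return w_bar
```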
Variational Stochastic Gradient Descent
In the Bayesian approach to probabilistic modeling of data, we select a model for the probabilities of the data that depends on a continuous vector of parameters. For a given data set, Bayes' theorem gives a probability distribution over the model parameters. The inference of outcomes and probabilities of new data can then be found by averaging over the parameter distribution of the model, which is an intr...
Byzantine Stochastic Gradient Descent
This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ ( 1...
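A minimal sketch of one robust aggregation step in the Byzantine setting described above, using a coordinate-wise median in place of the mean over the m reported gradients; the median rule is a stand-in for illustration rather than the aggregation analysed in the cited paper.

```python
# Sketch only: one SGD step with median aggregation of possibly Byzantine gradients.
import numpy as np

def byzantine_sgd_step(w, worker_grads, lr=0.01):
    # worker_grads: (m, d) array of gradients reported by the m machines,
    # an unknown alpha-fraction of which may be arbitrary.
    robust_grad = np.median(worker_grads, axis=0)  # coordinate-wise median
    return w - lr * robust_grad
```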
Journal
Journal title: Lecture Notes in Computer Science
Year: 2023
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-031-25599-1_10